翻訳と辞書 |
Calgary corpus : ウィキペディア英語版 | Calgary corpus
The Calgary corpus is a collection of text and binary data files, commonly used for comparing data compression algorithms. It was created by Ian Witten, Tim Bell and John Cleary from the University of Calgary in 1987 and was commonly used in the 1990s. In 1997 it was replaced by the Canterbury corpus, based on concerns about how representative the Calgary corpus was, but the Calgary corpus still exists for comparison and is still useful for its originally intended purpose. == Contents ==
In its most commonly used form, the corpus consists of 14 files totaling 3,141,622 bytes as follows. There is also a less commonly used 18 file version which include 4 additional text files in UNIX "troff" format, PAPER3 through PAPER6.
抄文引用元・出典: フリー百科事典『 ウィキペディア(Wikipedia)』 ■ウィキペディアで「Calgary corpus」の詳細全文を読む
スポンサード リンク
翻訳と辞書 : 翻訳のためのインターネットリソース |
Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.
|
|